AITopics

2509.18723

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Turcato, Niccolò, Libera, Alberto Dalla, Giacomuzzo, Giulio, Carli, Ruggero, Romeres, Diego

Learning control of underactuated double pendulum with Model-Based Reinforcement Learning

arXiv.org Artificial Intelligence, Sep-9-2024

This report describes our proposed solution for the second AI Olympics competition held at IROS 2024. Our solution is based on a recent Model-Based Reinforcement Learning algorithm named MC-PILCO. Besides briefly reviewing the algorithm, we discuss the most critical aspects of the MC-PILCO implementation in the tasks at hand.

algorithm, controller, mc-pilco, (13 more...)

2409.05811

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Italy (0.04)
Europe > Germany > Bremen > Bremen (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, May-30-2023

Learning Control by Iterative Inversion

Leibovich, Gal, Jacob, Guy, Avner, Or, Novik, Gal, Tamar, Aviv

We propose $\textit{iterative inversion}$ -- an algorithm for learning an inverse function without input-output pairs, but only with samples from the desired output distribution and access to the forward function. The key challenge is a $\textit{distribution shift}$ between the desired outputs and the outputs of an initial random guess, and we prove that iterative inversion can steer the learning correctly, under rather strict conditions on the function. We apply iterative inversion to learn control. Our input is a set of demonstrations of desired behavior, given as video embeddings of trajectories (without actions), and our method iteratively learns to imitate trajectories generated by the current policy, perturbed by random exploration noise. Our approach does not require rewards, and only employs supervised learning, which can be easily scaled to use state-of-the-art trajectory embedding techniques and policy representations. Indeed, with a VQ-VAE embedding and a transformer-based policy, we demonstrate non-trivial continuous control on several tasks. Further, we report improved performance on imitating diverse behaviors compared to reward-based methods.
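The loop described in this abstract can be illustrated with a deliberately small sketch. It is not the authors' implementation: the affine forward function, the affine inverse model, and the names `forward`, `fit_inverse`, and `iterative_inversion` are all assumptions chosen so the example stays short and checkable. Each round proposes inputs for the desired outputs using the current inverse guess plus exploration noise, runs the forward function, and refits the inverse by ordinary supervised regression on the resulting (output, input) pairs.

```python
import numpy as np


def forward(x):
    """Known forward function F (a hypothetical toy choice, not from the paper)."""
    return 2.0 * x + 1.0


def fit_inverse(y, x):
    """Least-squares fit of an affine inverse model x ~ a * y + b."""
    design = np.stack([y, np.ones_like(y)], axis=1)
    (a, b), *_ = np.linalg.lstsq(design, x, rcond=None)
    return a, b


def iterative_inversion(desired_y, n_iters=5, noise_std=0.5, seed=0):
    rng = np.random.default_rng(seed)
    a, b = rng.normal(size=2)  # random initial guess for the inverse
    for _ in range(n_iters):
        # Propose inputs for the desired outputs, perturbed by exploration noise.
        x = a * desired_y + b + rng.normal(scale=noise_std, size=desired_y.shape)
        # Query the forward function to see what those inputs actually produce.
        y = forward(x)
        # Supervised step: learn to map the produced outputs back to their inputs.
        a, b = fit_inverse(y, x)
    return a, b


# Only samples of the desired outputs are given; no input-output pairs.
desired_y = np.linspace(3.0, 5.0, 50)
a, b = iterative_inversion(desired_y)
print(a, b)
```

Because the toy forward function is affine and the inverse model is affine too, the regression recovers the exact inverse after a single round. The distribution-shift difficulty the abstract highlights only appears with nonlinear functions and more limited model classes.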

machine learning, reinforcement learning, trajectory, (16 more...)

2211.01724

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Neural Information Processing Systems, Apr-6-2023, 19:11:28 GMT

A Practice Strategy for Robot Learning Control

"Trajectory Extension Learning" is a new technique for Learning Control in Robots which assumes that there exists some parameter of the desired trajectory that can be smoothly varied from a region of easy solvability of the dynamics to a region of desired behavior which may have more difficult dynamics. By gradually varying the parameter, practice movements remain near the desired path while a Neural Network learns to approximate the inverse dynamics. For example, the average speed of motion might be varied, and the inverse dynamics can be "bootstrapped" from slow movements with simpler dynamics to fast movements. This provides an example of the more general concept of a "Practice Strategy" in which a sequence of intermediate tasks is used to simplify learning a complex task. I show an example of the application of this idea to a real 2-joint direct drive robot arm.

learning control, practice strategy, robot learning control

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Neural Information Processing Systems, Apr-6-2023, 19:03:48 GMT

Learning Control Under Extreme Uncertainty

A peg-in-hole insertion task is used as an example to illustrate the utility of direct associative reinforcement learning methods for learning control under real-world conditions of uncertainty and noise. Task complexity due to the use of an unchamfered hole and a clearance of less than 0.2mm is compounded by the presence of positional uncertainty of magnitude exceeding 10 to 50 times the clearance. Despite this extreme degree of uncertainty, our results indicate that direct reinforcement learning can be used to learn a robust reactive control strategy that results in skillful peg-in-hole insertions.

extreme uncertainty, learning control, reinforcement, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Neural Information Processing Systems, Apr-6-2023, 14:23:10 GMT

Bayesian Kernel Shaping for Learning Control

In kernel-based regression learning, optimizing each kernel individually is useful when the data density, curvature of regression surfaces (or decision boundaries) or magnitude of output noise (i.e., heteroscedasticity) varies spatially. Unfortunately, it presents a complex computational problem as the danger of overfitting is high and the individual optimization of every kernel in a learning system may be overly expensive due to the introduction of too many open learning parameters. Previous work has suggested gradient descent techniques or complex statistical hypothesis methods for local kernel shaping, typically requiring some amount of manual tuning of meta parameters. In this paper, we focus on nonparametric regression and introduce a Bayesian formulation that, with the help of variational approximations, results in an EM-like algorithm for simultaneous estimation of regression and kernel parameters. The algorithm is computationally efficient (suitable for large data sets), requires no sampling, automatically rejects outliers and has only one prior to be specified.

bayesian kernel shaping, learning control, regression, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.62)

Ronzani, Daniele, Mamedov, Shamil, Swevers, Jan

Vibration Free Flexible Object Handling with a Robot Manipulator Using Learning Control

arXiv.org Artificial Intelligence, Nov-20-2022

Many industries extensively use flexible materials. Effective approaches for handling flexible objects with a robot manipulator must address residual vibrations. Existing solutions rely on complex models, use additional instrumentation for sensing the vibrations, or do not exploit the repetitive nature of most industrial tasks. This paper develops an iterative learning control approach that jointly learns model parameters and residual dynamics using only the interoceptive sensors of the robot. The learned model is subsequently utilized to design optimal point-to-point (PTP) trajectories that account for residual vibration, nonlinear kinematics of the manipulator and joint limits. We experimentally show that the proposed approach reduces the residual vibrations by an order of magnitude compared with optimal vibration suppression using the analytical model and threefold compared with the available state-of-the-art method. These results demonstrate that effective handling of a flexible object requires neither complex models nor additional instrumentation.
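For readers unfamiliar with iterative learning control, the generic idea of exploiting task repetition can be sketched as follows. This is a textbook P-type update on a toy first-order plant, not the method of this paper, which learns model parameters and residual dynamics. The plant coefficients, the learning gain, and the names `run_trial` and `ilc` are assumptions made for illustration.

```python
import numpy as np


def run_trial(u, a=0.3, b=1.0):
    """Simulate one trial of the toy plant y[t+1] = a*y[t] + b*u[t], with y[0] = 0."""
    y = np.zeros(len(u) + 1)
    for t in range(len(u)):
        y[t + 1] = a * y[t] + b * u[t]
    return y[1:]  # the outputs that inputs u[0..N-1] can influence


def ilc(reference, n_trials=40, gain=0.9):
    """Repeat the same task, correcting the input with the previous trial's error."""
    u = np.zeros_like(reference)
    history = []
    for _ in range(n_trials):
        error = reference - run_trial(u)
        history.append(np.max(np.abs(error)))
        # P-type update: the error observed at step t+1 corrects the input at step t.
        u = u + gain * error
    return u, history


reference = np.sin(np.linspace(0.0, np.pi, 50))
u, history = ilc(reference)
print(history[0], history[-1])
```

With these assumed values, the trial-to-trial error map has an infinity norm below 1, so the peak tracking error shrinks from one trial to the next. On a real manipulator the gain would have to be chosen from a plant model or tuned conservatively.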

artificial intelligence, machine learning, vibration, (13 more...)

2211.11076

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry: Materials (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Ting, Jo-anne, Kalakrishnan, Mrinal, Vijayakumar, Sethu, Schaal, Stefan

Bayesian Kernel Shaping for Learning Control

Neural Information Processing Systems, Feb-15-2020, 03:44:16 GMT

In kernel-based regression learning, optimizing each kernel individually is useful when the data density, curvature of regression surfaces (or decision boundaries) or magnitude of output noise (i.e., heteroscedasticity) varies spatially. Unfortunately, it presents a complex computational problem as the danger of overfitting is high and the individual optimization of every kernel in a learning system may be overly expensive due to the introduction of too many open learning parameters. Previous work has suggested gradient descent techniques or complex statistical hypothesis methods for local kernel shaping, typically requiring some amount of manual tuning of meta parameters. In this paper, we focus on nonparametric regression and introduce a Bayesian formulation that, with the help of variational approximations, results in an EM-like algorithm for simultaneous estimation of regression and kernel parameters. The algorithm is computationally efficient (suitable for large data sets), requires no sampling, automatically rejects outliers and has only one prior to be specified.

bayesian kernel shaping, learning control, regression, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.62)

Neural Information Processing Systems, Dec-31-1993

A Practice Strategy for Robot Learning Control

Sanger, Terence D.

"Trajectory Extension Learning" is a new technique for Learning Control in Robots which assumes that there exists some parameter of the desired trajectory that can be smoothly varied from a region of easy solvability of the dynamics to a region of desired behavior which may have more difficult dynamics. By gradually varying the parameter, practice movements remain near the desired path while a Neural Network learns to approximate the inverse dynamics. For example, the average speed of motion might be varied, and the inverse dynamics can be "bootstrapped" from slow movements with simpler dynamics to fast movements. This provides an example of the more general concept of a "Practice Strategy" in which a sequence of intermediate tasks is used to simplify learning a complex task. I show an example of the application of this idea to a real 2-joint direct drive robot arm.

inverse dynamic, practice strategy, trajectory, (14 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
North America > United States > New Jersey (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.38)

Neural Information Processing Systems, Dec-31-1993

Learning Control Under Extreme Uncertainty

Gullapalli, Vijaykumar

A peg-in-hole insertion task is used as an example to illustrate the utility of direct associative reinforcement learning methods for learning control under real-world conditions of uncertainty and noise. Task complexity due to the use of an unchamfered hole and a clearance of less than 0.2mm is compounded by the presence of positional uncertainty of magnitude exceeding 10 to 50 times the clearance. Despite this extreme degree of uncertainty, our results indicate that direct reinforcement learning can be used to learn a robust reactive control strategy that results in skillful peg-in-hole insertions.

gullapalli, insertion, reinforcement, (14 more...)